ABSTRACT

Applications of the general linear model in experimental
design and analysis usually involve design matrices of less
than full column rank. This may present a problem in determining
which elements and functions of the parameter vector
are estimable and which hypotheses are testable. This thesis
discusses two methods of answering questions about estimability
and testability, where the form of the design matrix
determines the method to be used. The two methods, both of
which can use computer routines, are: (1) a direct mathematical
computational approach, and (2) a modification of an analysis
of variance routine, with a special case of this method using
a modified ANOVA routine and solutions to systems of linear
equations. Confounding of effects is developed mathematically
in connection with determining estimable functions. The methods
discussed in this thesis can be applied to the area of Army
Test and Evaluation.
I. INTRODUCTION

The model of interest is the general linear model:

$$y = Xb + e$$

where y is an n x 1 vector of observations,
X is an n x p matrix of known values,
b is a p x 1 vector of parameters,
and e is a vector of random error terms.

Throughout this thesis a capital letter will denote a matrix,
with a prime or a superscript of "-1" representing its transpose
or inverse, respectively. A lower case letter with a tilde
above it will denote a vector, with a prime again representing
its transpose. This model is used in regression analysis and
design analysis [Ref. 10]. The derivation of a least squares
estimator for b involves minimizing the sum of squares of the
differences between the observations and their expected values.
This problem can be further reduced to the problem of solving
the normal equations

$$X'X\hat{b} = X'y, \tag{1}$$

in which $\hat{b}$ is the estimator of b. In the case where X is of
full rank p, a unique solution to the normal equations exists.
The solution to the normal equations may then be written in the form

$$\hat{b} = (X'X)^{-1}X'y. \tag{2}$$
The solution $\hat{b}$ found under these conditions can be shown to
be the best linear unbiased estimator (b.l.u.e.) of b.
References 4, 9, and 10 contain proofs of these statements.

If the model is not of full rank, then (X'X)^-1 does not
exist and a solution to the normal equations may be written
in terms of a generalized inverse of X'X. A generalized inverse
G  of  a  matrix  A  is  defined  to  be  any  matrix  having  the  property 
that  AGA  =  A.  The  linear  model  with  a  matrix  X  that  is  not  of 
full  column  rank  can  arise  in  various  ways.  For  example,  the 
nature  of  the  experiment  may  result  in  a  design  matrix  X  that 
is  not  of  full  column  rank,  or  the  experimenter  may  have  had 
problems  that  caused  the  experiment  not  to  be  conducted  in 
accordance  with  an  original  full  rank  design.  In  this  case 
an  analyst  may  wish  to  perform  a  "salvage  operation"  to  gain 
as  much  information  as  possible  from  the  data  derived  from 
the experiment. In any case, the analyst is faced with these
questions:

(1) What is the actual form of the experiment conducted
(i.e., what is X)?

(2) What inferences can be made from the information
attained (i.e., what effects (elements and functions
of b) are estimable and what hypotheses are testable)?

With the less than full rank model, the solution to the
normal equations is not unique; rather, many solutions exist.
Based on a generalized inverse G of X'X, a solution to the
normal equations (1) may be written in the form

b° = GX'y (3)
For each generalized inverse G, discussed in detail by Pringle
and Rayner [Ref. 8], there is a solution b° given by (3),
and conversely (i.e., all solutions can be expressed in the
form of (3)). Searle [Ref. 10] states that the normal equations
are consistent and that the solutions to (1) are given
by (3) if and only if G is a generalized inverse of X'X.
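As an illustration of this non-uniqueness, the following Python sketch (NumPy is assumed; the observation vector y is made up, and the small one-way design is chosen only for illustration) forms two different generalized inverses of X'X. Both give solutions of the normal equations (1), the two solutions differ, and the fitted values Xb° nevertheless agree.

```python
import numpy as np

# Small one-way layout (columns: m, a1, a2); X has rank 2, so X'X is singular.
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)
y = np.array([3.0, 5.0, 8.0, 10.0])   # made-up observations
XtX = X.T @ X

# Two generalized inverses of X'X (each satisfies A G A = A).
G1 = np.diag([0.0, 0.5, 0.5])         # invert the lower-right 2x2 block, zeros elsewhere
G2 = np.linalg.pinv(XtX)              # Moore-Penrose inverse

b1 = G1 @ X.T @ y                     # one solution b° = G X'y
b2 = G2 @ X.T @ y                     # a different solution

print("b1 =", b1)
print("b2 =", b2)
print("both solve the normal equations:",
      np.allclose(XtX @ b1, X.T @ y), np.allclose(XtX @ b2, X.T @ y))
print("fitted values agree:", np.allclose(X @ b1, X @ b2))
```

The choice of `np.linalg.pinv` for the second inverse is only a convenience; any matrix G with X'X G X'X = X'X would serve.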

This  thesis  develops  and  discusses  several  methods  for  an 
experimenter  to  determine  the  estimability  and  testability  of 
linear functions of the parameter vector. First, a mathematically
straightforward solution to these problems will be
outlined.  Then,  the  use  of  an  analysis  of  variance  computer 
routine  for  implementation  of  this  solution  is  discussed. 
II. DEVELOPMENT

To answer questions about testability, several topics
must be discussed, including the generalized inverse of X'X
and estimability of a linear function of b.

A.  GENERALIZED  INVERSE  MATRIX 

An introduction to the theory of generalized inverses is
contained in Searle [Ref. 10]. A detailed discussion is
available in Pringle and Rayner [Ref. 8]. Throughout this
paper the symbol G will be used to represent a generalized
inverse of X'X. As discussed in Reference 8, the matrix G
has many alternate names, such as "pseudo-inverse," "conditional
inverse" and "g-inverse," so that identical information
appears in the literature under several different names.
Searle presents the following important properties of a
generalized inverse:

Theorem 1. When G is a generalized inverse of the matrix
X'X, then

1. G' is also a generalized inverse of X'X;

2. XGX'X = X; i.e., GX' is a generalized inverse of X;

3. XGX' is invariant to G.
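The three properties can be checked numerically. The sketch below is an illustration only, assuming NumPy and a small 2 x 2 additive layout that is not taken from the thesis. It builds a second, generally non-symmetric generalized inverse from the Moore-Penrose inverse using the identity G2 = G + U - G A U A G (valid for any matrix U), and then tests each statement of Theorem 1.

```python
import numpy as np

# 2x2 additive layout (columns: m, a1, a2, c1, c2); rank 3 with 5 columns.
X = np.array([[1, 1, 0, 1, 0],
              [1, 1, 0, 0, 1],
              [1, 0, 1, 1, 0],
              [1, 0, 1, 0, 1]], dtype=float)
A = X.T @ X

def is_ginv(G, M):
    """True when G satisfies the defining property M G M = M."""
    return np.allclose(M @ G @ M, M)

G = np.linalg.pinv(A)                  # one generalized inverse of X'X
rng = np.random.default_rng(0)
U = rng.normal(size=A.shape)           # arbitrary matrix
G2 = G + U - G @ A @ U @ A @ G         # another g-inverse, generally not symmetric

checks = {
    "G2 is a g-inverse of X'X": is_ginv(G2, A),
    "1. G2' is a g-inverse of X'X": is_ginv(G2.T, A),
    "2. X G2 X'X = X": np.allclose(X @ G2 @ A, X),
    "2. G2 X' is a g-inverse of X": is_ginv(G2 @ X.T, X),
    "3. X G X' = X G2 X'": np.allclose(X @ G @ X.T, X @ G2 @ X.T),
}
for statement, ok in checks.items():
    print(f"{statement}: {ok}")
```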

Various  methods  for  computing  a  generalized  inverse  are 
described  in  detail  by  Searle  and  by  Pringle  and  Rayner.  The 
properties  of  G  stated  in  Theorem  1  are  important  for  the 
derivations  in  this  thesis,  but  the  direct  computation  of  G 
will  not  be  required  for  reasons  described  in  the  next  chapter. 
B. EXPECTED VALUES AND THE MATRIX H

Since X'X is, in general, not of full rank, the normal
equations cannot be solved for a unique solution
$\hat{b} = (X'X)^{-1}X'y$ as in the full rank model (equation (2)).
The normal equations for the less than full rank model are
therefore written as

X'Xb° = X'y

in which b° denotes any one of the many solutions that exist.
Letting G denote a generalized inverse matrix of X'X, the
corresponding solution is given by

b° = GX'y (4)

The expected value of b° is given by

E(b°) = GX'E(y) = GX'Xb = Hb,
where H = GX'X. Because H is formed from G, it depends on which
generalized inverse is chosen; the product XH = XGX'X = X,
however, is the same for every G (Theorem 1). For this case,
note that b° is an unbiased estimator of Hb, rather than of b.

C.  ESTIMABLE  FUNCTIONS 

Searle [Ref. 10] states that a linear function of parameters,
in this case of components of b, is estimable if it
is identically equal to some linear function of the expected
value of y, the vector of observations. In other words, q'b
is estimable if and only if q'b = t'E(y) for some vector t'.
The vector t' may not be unique.
There are several important properties of estimable
functions that will be necessary in determining which hypotheses
are testable (see Reference 10).

1. Linear combinations of estimable functions are
estimable.

2. All estimable functions are linear combinations of the
elements of Xb. The expected value of y, E(y), is equal to Xb.
By definition, if q'b is estimable, then q'b = t'E(y) for some
t', so

q'b = t'Xb. (5)

The concept of estimability does not depend on the value of b,
so equation (5) must be true for all values of b. Therefore

q' = t'X

for some vector t'. We thus arrive at the following important
characterization: q'b is estimable if and only if q' = t'X
for some t'.

3. When q'b is estimable, q'b° is invariant to whatever
solution b° of the normal equations is used. This is true
because, by the previous property, for some t',

q'b° = t'Xb° = t'XGX'y

and XGX' is invariant under selections of G (Theorem 1). Thus,
q'b° is invariant under choices of G and hence to b°, when
q'b is estimable.
4. The following theorem [Ref. 10] provides a procedure
for checking the estimability of q'b.

Theorem 2. A given function q'b is estimable if
and only if q'H = q'.

Proof: If q'b is estimable, then the definition
of estimability implies that q' = t'X for
some t', and q'H = t'XH = t'XGX'X = t'X = q'
by Theorem 1. Conversely, if q'H = q', then
q' = q'GX'X = t'X for t' = q'GX'.
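A minimal sketch of the check in Theorem 2, set beside the characterization in property 2 (q' = t'X for some t'), is given below. It assumes NumPy; the function names are hypothetical, and the Moore-Penrose inverse is used for G purely as a convenient choice of generalized inverse. The second function tests whether q' lies in the row space of X by comparing ranks, which is an equivalent criterion.

```python
import numpy as np

def h_matrix(X):
    """H = G X'X with G taken (by assumption) as the Moore-Penrose inverse of X'X."""
    XtX = X.T @ X
    return np.linalg.pinv(XtX) @ XtX

def estimable_by_H(q, X, tol=1e-9):
    """Theorem 2: q'b is estimable if and only if q'H = q'."""
    return bool(np.allclose(q @ h_matrix(X), q, atol=tol))

def estimable_by_rank(q, X):
    """Property 2: q' = t'X for some t', i.e. appending q' to X leaves the rank unchanged."""
    return bool(np.linalg.matrix_rank(np.vstack([X, q])) == np.linalg.matrix_rank(X))

# One-way layout with columns m, a1, a2 (illustrative design).
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)

for q in ([0, 1, -1], [1, 1, 0], [0, 1, 0], [1, 0, 0]):
    q = np.array(q, dtype=float)
    print(q, estimable_by_H(q, X), estimable_by_rank(q, X))
```

The two criteria agree for every q, whichever generalized inverse is used to form H.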

D.  TESTABLE  HYPOTHESIS 

A testable hypothesis is a hypothesis that can be expressed
in terms of estimable functions. Assume that a null hypothesis
takes the form

H0: q'b = 0. (6)

If the null hypothesis q'b = 0 is to be tested by the analyst,
then q'b° will be part of the test statistic, which will need
to be invariant to the choice of b° (a detailed proof is
contained in Ref. 10). As discussed earlier, q'b° is invariant
if q'b is estimable. Thus, by applying Theorem 2, if q'H = q'
then q'b is estimable and H0: q'b = 0 is a testable hypothesis.
Similarly, if q'H ≠ q', then q'b is not estimable, and therefore
the hypothesis H0: q'b = 0 is not a testable hypothesis. In the
testing of a hypothesis, Searle [Ref. 10] proves that if an
analyst uses the standard analysis of variance procedure to
"test" the hypothesis given by equation (6) when in fact q'b
is not estimable, the actual hypothesis tested is

H0': q'Hb = 0.
The discussions concerning estimability of q'b and the
testing of the hypothesis H0: q'b = 0 also can be applied to
a set of functions Q'b, where

Q'b = {q_i'b} for i = 1, ..., s,

b is a p x 1 vector, each q_i' is 1 x p, and Q' is an s x p matrix.
Thus Q'b is estimable if and only if Q'H = Q', and the
hypothesis H0: Q'b = 0 is testable if Q'b is estimable.

The next chapter describes methods of computing H, and
the use of the matrix H will be demonstrated in examples in
Chapter IV.
III. METHODS OF DETECTING THE MATRIX H

Determination of the matrix H is useful in answering
questions concerning estimable functions and testable hypotheses.
As discussed in Chapter II, the function q'b is estimable
if and only if q'H = q', and the hypothesis H0: q'b = 0
is testable only if q'b is estimable. Recalling that H = GX'X
where G is a generalized inverse of the matrix X'X, the problem
becomes one of computing the matrix G and then performing the
matrix multiplication to obtain GX'X.

For many designs encountered in practice, the matrices X, X',
and G are of large dimensions. Two approaches for computing
H will be discussed and demonstrated by examples. The first
approach is the straightforward mathematical approach,
but the most practical method appears to be one utilizing an
analysis of variance (ANOVA) computer program. The approach
using ANOVA computer programs is practical because the analyst
would presumably be using a program of that type to analyze
test data anyway. As will be discussed, with a simple modification
of program inputs, the matrix H can be computed using
the ANOVA program.

A.  DIRECT  MATHEMATICAL  COMPUTATION  OF  MATRIX  H 

Although the direct approach is demonstrated by a sample
problem with a design matrix X of relatively small dimensions,
this approach provides insights into determining estimability.
For ease of discussion, the sample problem will be presented
as a step-by-step procedure.

For this example, the model can be written in the form

y_ij = m + a_i + e_ij, for i = 1, 2, and j = 1, 2.

In matrix form this becomes

$$
\begin{pmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}
\begin{pmatrix} m \\ a_1 \\ a_2 \end{pmatrix}
+
\begin{pmatrix} e_{11} \\ e_{12} \\ e_{21} \\ e_{22} \end{pmatrix}
$$

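1. Compute X'X

As described in Chapter II, the product X'X is needed
throughout the computations.

$$
X'X =
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}
$$

Note that the rank of X'X is 2.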

2. Compute G

Several methods are available for computing a generalized
inverse of a matrix [Refs. 8 and 10]. The method used in this
example is described by Searle [Ref. 10]. The order of X'X is 3
and its rank is 2, so the first step is to delete (3-2) rows and
corresponding columns from X'X, to leave a sub-matrix of full
rank 2 called (X'X)_n. For this example the first row and column
are deleted to yield

$$
(X'X)_n = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}
\quad \text{and} \quad
(X'X)_n^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.
$$

Thus G, a generalized inverse of X'X, is determined by
replacing all elements of (X'X)_n by those of its inverse
and putting in zero for all other elements of X'X. Applying
this procedure yields

$$
G = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}.
$$

As a check to see if G is a generalized inverse of X'X,
determine if X'XGX'X = X'X:

$$
X'XGX'X =
\begin{pmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}
=
\begin{pmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}
= X'X.
$$
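3. Compute H

The computation of H is a direct matrix multiplication:

$$
H = GX'X =
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
\tag{7}
$$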
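Steps 1 through 3 can be sketched in a few lines of Python. This is an illustration, not the routine used in the thesis: NumPy is assumed, and the helper name `ginv_by_deletion` is hypothetical. It handles only the case used here, where the retained rows and columns form a nonsingular principal sub-matrix of a symmetric matrix.

```python
import numpy as np

def ginv_by_deletion(A, keep):
    """Generalized inverse of a symmetric matrix A by the deletion method.

    `keep` lists the rows/columns retained; the retained principal sub-matrix
    must be nonsingular with order equal to rank(A). Its inverse is written
    back in place and every other element is set to zero.
    """
    A = np.asarray(A, dtype=float)
    keep = list(keep)
    G = np.zeros_like(A)
    G[np.ix_(keep, keep)] = np.linalg.inv(A[np.ix_(keep, keep)])
    return G

X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)

XtX = X.T @ X                          # step 1
G = ginv_by_deletion(XtX, keep=[1, 2]) # step 2: delete the first row and column
H = G @ XtX                            # step 3

print(XtX)
print(G)
print(H)
```

Choosing a different set of rows and columns to retain gives a different G, and in general a different H, but the verdicts of Theorem 2 are unaffected.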


4. Compute H Using Alternate Method

Two equations are useful in demonstrating an alternate
method for computing H:

H = GX'X

b° = GX'y, for any y.

If the y used for computation is x_j, where x_j is the j-th
column of the design matrix X, then a solution b_j° can be
computed for each j. Partitioning the matrices X and H, the
following result is noted:

$$
H = (b_1^{\circ} \mid b_2^{\circ} \mid \cdots \mid b_p^{\circ})
  = GX'(x_1 \mid x_2 \mid \cdots \mid x_p) = GX'X.
$$

For this thesis the columns x_j of the matrix X will be
referred to as "pseudo-data."
To demonstrate this procedure of computing H, the
previous example will be used. The solution becomes one of
computing

$$b_j^{\circ} = GX'x_j \quad \text{for } j = 1, 2, 3. \tag{8}$$

For j = 1,

$$
b_1^{\circ} = GX'x_1 =
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix};
$$

similarly,

$$
b_2^{\circ} = GX'x_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
\quad \text{and} \quad
b_3^{\circ} = GX'x_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
$$

Thus

$$
H = (b_1^{\circ} \mid b_2^{\circ} \mid b_3^{\circ}) =
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix},
\text{ as before.} \tag{9}
$$

Of course H computed using this method is the same as the
H determined in step 3. This method of computing H is
important because an ANOVA computer routine can be used to
solve for the columns of H, using pseudo-data in place of y.
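The pseudo-data idea can be sketched as follows. The function `fit` is a hypothetical stand-in for an ANOVA or regression routine: all that is assumed of it is that it returns some solution b° of the normal equations for whatever y it is given. Here it is implemented with the G of step 2 so that the result can be compared with equation (7); NumPy is assumed.

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)
G = np.diag([0.0, 0.5, 0.5])           # generalized inverse of X'X from step 2

def fit(y):
    """Hypothetical stand-in for the routine: returns the solution b° = G X'y."""
    return G @ X.T @ y

# Feed each column of X to the routine as pseudo-data and collect the solutions.
solutions = [fit(X[:, j]) for j in range(X.shape[1])]
H = np.column_stack(solutions)         # H = (b_1° | b_2° | b_3°)

print(H)
```

Only `fit` would change if a different routine were substituted; the assembly of H from the returned coefficient vectors stays the same.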
B. USE OF ANOVA ROUTINE TO COMPUTE THE MATRIX H

In section A two methods were discussed for computing H.
The simplicity of the design matrix X used in the example
permitted easy computations. A more realistic design matrix
in practice would be of such large dimensions that hand
computations might become impractical. Computers provide a means
of solving this problem. One approach on a computer would be
a direct mathematical approach using available computer routines
for computing G, then H. Three routines would be required to
compute H: one routine to compute X'X; a second routine to
compute the generalized inverse of the matrix X'X; and a third
routine to compute the matrix product GX'X to obtain H.
This method is straightforward but requires detailed knowledge
of the routines on the part of the analyst. The limitations of
this approach are those of the computer routines used,
which normally take the form of dimension capacities.

A second approach using an ANOVA computer routine may be
more practical for an analyst, since an ANOVA routine will
probably be used to analyze the data from the experiment anyway.
The procedure involves the use of pseudo-data
as described in step 4 of section A. Pseudo-data is
analyzed by the routine, with the results of interest being
the regression coefficients as determined by solving the
normal equations (equation (1)) under the full model hypothesis.
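This procedure can be demonstrated by an example using
the General-Linear Hypothesis routine, BMD05V, of the BMD
Biomedical Computer Programs [Ref. 11]. Suppose that the
model is

y_ij = m + a_i + c_j + ac_ij + e_ij, i = 1, 2; j = 1, 2, 3.

In matrix form this model can be written in the general form

$$
\begin{pmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{pmatrix}
=
\left(\begin{array}{cccccccccccc}
1&1&0&1&0&0&1&0&0&0&0&0 \\
1&1&0&0&1&0&0&1&0&0&0&0 \\
1&1&0&0&0&1&0&0&1&0&0&0 \\
1&0&1&1&0&0&0&0&0&1&0&0 \\
1&0&1&0&1&0&0&0&0&0&1&0 \\
1&0&1&0&0&1&0&0&0&0&0&1
\end{array}\right)
\begin{pmatrix} m \\ a_1 \\ a_2 \\ c_1 \\ c_2 \\ c_3 \\ ac_{11} \\ ac_{12} \\ ac_{13} \\ ac_{21} \\ ac_{22} \\ ac_{23} \end{pmatrix}
+
\begin{pmatrix} e_{11} \\ e_{12} \\ e_{13} \\ e_{21} \\ e_{22} \\ e_{23} \end{pmatrix}
\tag{10}
$$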


Each column of the design matrix is entered as pseudo-data.
The computer output lists estimates of coefficients
corresponding to the full model hypothesis. This gives b_j° for the
j-th column of pseudo-data. Using the smaller dimensional
notation required by the BMD05V routine causes the coefficient
vector to be in a reduced form. This can be expanded to the
full b_j° by applying the "usual linear restrictions" on the
parameters given by

$$\sum_{i} a_i = 0, \quad \sum_{j=1}^{3} c_j = 0, \quad \text{etc.}$$

These linear restrictions are commonly placed on
the parameters in ANOVA problems, since by using these
restrictions, the "reduced" design matrix may become of full
rank, which in turn aids in solving the normal equations.

The matrix H can then be formed as previously discussed by
taking

$$H = (b_1^{\circ} \mid \cdots \mid b_{12}^{\circ}).$$

Using the design matrix from equation (10) and the BMD05V
routine, this procedure was followed to determine:
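$$
H = \left(\begin{array}{rrrrrrrrrrrr}
1 & 1/2 & 1/2 & 1/3 & 1/3 & 1/3 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\
0 & 1/2 & -1/2 & 0 & 0 & 0 & 1/6 & 1/6 & 1/6 & -1/6 & -1/6 & -1/6 \\
0 & -1/2 & 1/2 & 0 & 0 & 0 & -1/6 & -1/6 & -1/6 & 1/6 & 1/6 & 1/6 \\
0 & 0 & 0 & 2/3 & -1/3 & -1/3 & 1/3 & -1/6 & -1/6 & 1/3 & -1/6 & -1/6 \\
0 & 0 & 0 & -1/3 & 2/3 & -1/3 & -1/6 & 1/3 & -1/6 & -1/6 & 1/3 & -1/6 \\
0 & 0 & 0 & -1/3 & -1/3 & 2/3 & -1/6 & -1/6 & 1/3 & -1/6 & -1/6 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/3 & -1/6 & -1/6 & -1/3 & 1/6 & 1/6 \\
0 & 0 & 0 & 0 & 0 & 0 & -1/6 & 1/3 & -1/6 & 1/6 & -1/3 & 1/6 \\
0 & 0 & 0 & 0 & 0 & 0 & -1/6 & -1/6 & 1/3 & 1/6 & 1/6 & -1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & -1/3 & 1/6 & 1/6 & 1/3 & -1/6 & -1/6 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/6 & -1/3 & 1/6 & -1/6 & 1/3 & -1/6 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/6 & 1/6 & -1/3 & -1/6 & -1/6 & 1/3
\end{array}\right)
\tag{11}
$$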

As  a  check,  this  matrix  H  satisfies  the  requirement  that 
X'XGX’X  =  X'XH  =  X'X. 
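A matrix of this form can be reproduced without the BMD05V routine by imposing the usual sum-to-zero restrictions explicitly. The sketch below is an assumption-laden illustration, not a description of how BMD05V works internally: it assumes NumPy, and it assumes that the routine's reduced parametrization is equivalent to solving Xb = x_j together with the restrictions sum_i a_i = 0, sum_j c_j = 0, sum_i ac_ij = 0 and sum_j ac_ij = 0. Under those restrictions the stacked system has full column rank, so each column of pseudo-data has exactly one solution.

```python
import numpy as np
from itertools import product

p = 12   # parameter order: m, a1, a2, c1, c2, c3, ac11, ac12, ac13, ac21, ac22, ac23

# Design matrix of the two-way model with interaction, one observation per cell.
X = np.zeros((6, p))
for row, (i, j) in enumerate(product(range(2), range(3))):
    X[row, [0, 1 + i, 3 + j, 6 + 3 * i + j]] = 1

def restriction(indices):
    """Row vector with ones at the given parameter positions (sum-to-zero restriction)."""
    r = np.zeros(p)
    r[list(indices)] = 1
    return r

C = np.array(
    [restriction([1, 2]), restriction([3, 4, 5])]                      # a_i and c_j
    + [restriction([6 + j, 9 + j]) for j in range(3)]                  # sum over i of ac_ij
    + [restriction(range(6 + 3 * i, 9 + 3 * i)) for i in range(2)]     # sum over j of ac_ij
)

# Solve [X; C] b = [x_j; 0] for every column x_j of X at once.
M = np.vstack([X, C])
rhs = np.vstack([X, np.zeros((C.shape[0], p))])
H, *_ = np.linalg.lstsq(M, rhs, rcond=None)

print(np.round(6 * H).astype(int))     # entries of H in units of 1/6
print("X'XH = X'X:", np.allclose(X.T @ X @ H, X.T @ X))
```

Printing 6H as integers makes the pattern of halves, thirds and sixths easy to compare with the matrix above.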

If  the  transpose  product  of  the  design  matrix  used  as 
input  for  the  BMD05V  routine  is  singular,  then  the  BMD05V 
routine  will  terminate  with  an  error  statement.  In  such  a 
case,  the  design  matrix  must  be  modified  so  as  to  accomplish 
the  required  results. 

To outline the procedure in such a case, consider the
model

y_ijk = m + a_i + c_j + d_k + acd_ijk + cd_jk + ad_ik + ac_ij + e_ijk,

i = 1, 2; j = 1, 2; k = 1, 2.
This model is a 2^3 factorial design, and for this example
there will be observations in only four of the eight cells
(1/2 replication). Suppose observations are numbered and
placed as shown below:

              c1            c2
           d1    d2      d1    d2
    a1      1                   2
    a2            3       4

The cells containing numbers represent the locations of the
observations. The transpose of the design matrix X and the
parameter vector b for this model are shown in Figures 1 and
2. The first step in computing H will be to determine the form
of the reduced design matrix Z, which would be required by
the BMD05V routine. Thus the form would be

$$
Z = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}
$$

This design matrix Z will not run on BMD05V because the product
Z'Z is singular. To reduce Z to a form acceptable to the
computer routine, determine a set of linearly independent
columns of Z and write these columns as the matrix A. For
this example, the first four columns suffice:


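$$
X' = \left(\begin{array}{cccc}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{array}\right)
$$

FIGURE 1. Transpose of Design Matrix for 1/2 Replication of
a 2^3 Factorial Design.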
$$
b = (m,\; a_1,\; a_2,\; c_1,\; c_2,\; d_1,\; d_2,\;
acd_{111},\; acd_{112},\; acd_{121},\; acd_{122},\;
acd_{211},\; acd_{212},\; acd_{221},\; acd_{222},\;
cd_{11},\; cd_{12},\; cd_{21},\; cd_{22},\;
ad_{11},\; ad_{12},\; ad_{21},\; ad_{22},\;
ac_{11},\; ac_{12},\; ac_{21},\; ac_{22})'
$$

FIGURE 2. Parameter Vector for 1/2 Replication of a 2^3
Factorial Design.
$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1
\end{pmatrix}
$$

Note that Z is of the form Z = (A | A). Since A is nonsingular,
so is A'A, and the inverse of A'A exists. Let G_A = (A'A)^-1,
so that G_A (A'A) = (A'A)^-1 (A'A) = I. The remaining steps are
to compute G_Z, a generalized inverse of Z'Z, and then to
compute H_Z. Observe that

$$
Z'Z = \begin{pmatrix} A' \\ A' \end{pmatrix} (A \;\; A)
    = \begin{pmatrix} A'A & A'A \\ A'A & A'A \end{pmatrix}.
$$

Using the fact that A'A G_A A'A = A'A, G_Z is of the form

$$
G_Z = \frac{1}{4} \begin{pmatrix} G_A & G_A \\ G_A & G_A \end{pmatrix}.
$$

The matrix G_Z possesses the property that Z'Z G_Z Z'Z = Z'Z,
so

$$
H_Z = G_Z Z'Z
    = \frac{1}{4} \begin{pmatrix} G_A & G_A \\ G_A & G_A \end{pmatrix}
      \begin{pmatrix} A'A & A'A \\ A'A & A'A \end{pmatrix}
    = \frac{1}{2} \begin{pmatrix} I & I \\ I & I \end{pmatrix},
$$

where I is the 4x4 identity matrix. This is in a reduced
form and thus must be expanded from its 8x8 dimension to the
27x27 H matrix. Additional columns can be added (i.e., rows
expanded) based on the fact that each of the remaining columns
can be expressed as a linear combination of the columns of the
matrix H_A, where H_A consists of the first four columns of H_Z.
The row expansion of H_Z will produce the 8x27 dimensional matrix
H_RZ. To expand the rows, recall that each column of the
design matrix X, denoted by x_j, for j = 5, ..., 27, can be
expressed as a linear combination of the columns of the matrix A
(the first four columns of X). This linear combination can be
expressed as

A v_j = x_j, for j = 5, ..., 27.

Since A is nonsingular, A^-1 exists and

v_j = A^-1 x_j, for j = 5, ..., 27.

The same linear relationship exists between the columns of
H_RZ. Letting h_j be the j-th column of H_RZ and x_j be
the corresponding column of pseudo-data from the design matrix
X (Figure 1), h_j can be computed as

h_j = H_A v_j, for j = 5, ..., 27 in this example.

Each column can be expanded (i.e., additional rows added) by
applying the same restraints as before, in the form:

$$\sum_i a_i = \sum_j c_j = \sum_k d_k = 0,$$

$$\sum_j cd_{jk} = \sum_k cd_{jk} = \sum_i ad_{ik} = 0, \tag{12}$$

and

$$\sum_k ad_{ik} = \sum_i ac_{ij} = \sum_j ac_{ij} = 0.$$
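The block structure used in this derivation can be verified numerically. The sketch below assumes NumPy. The particular entries of A are an assumption made for illustration (a column of ones followed by 0/1 indicator columns for the three main effects), but the identities checked hold for any nonsingular 4x4 matrix A. The symbols G_A, G_Z, H_Z and H_A follow the notation of the text; the pseudo-data vector x is an arbitrary example.

```python
import numpy as np

# Assumed entries for A; any nonsingular 4x4 matrix would serve.
A = np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1]], dtype=float)
Z = np.hstack([A, A])                       # reduced design matrix Z = (A | A)
ZtZ = Z.T @ Z

G_A = np.linalg.inv(A.T @ A)
J = np.ones((2, 2))
G_Z = 0.25 * np.kron(J, G_A)                # (1/4) [[G_A, G_A], [G_A, G_A]]
H_Z = G_Z @ ZtZ

print("G_Z is a g-inverse of Z'Z:", np.allclose(ZtZ @ G_Z @ ZtZ, ZtZ))
print("H_Z = (1/2)[[I, I], [I, I]]:", np.allclose(H_Z, 0.5 * np.kron(J, np.eye(4))))

# Column expansion: for pseudo-data x, solve A v = x and take h = H_A v,
# where H_A is the first four columns of H_Z.
x = np.array([0.0, 1.0, 0.0, 0.0])          # example pseudo-data column
v = np.linalg.solve(A, x)
h = H_Z[:, :4] @ v

print("h equals G_Z Z'x:", np.allclose(h, G_Z @ Z.T @ x))
```

The last check shows why the column expansion is legitimate: H_A v is the same reduced coefficient vector that the generalized inverse G_Z would return for the pseudo-data x.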
Using this procedure as described, the 27x27 H for the
1/2 replication of a 2^3 factorial experimental design was
obtained. It is shown in Figures 3 and 4, where H = (H_1 | H_2).

FIGURE 3. The Matrix H_1

FIGURE 4. The Matrix H_2
IV. APPLICATIONS OF THE MATRIX H

The discussion in Chapter II as to the importance of
the matrix H in answering an experimenter's questions on
estimability and testability can be applied to the three
examples in the previous chapter. In examining these examples,
the concept of confounding can be mathematically explained
and illustrated. Confounding, as defined by References 1, 2,
3, 5, 6 and 7, is the designing or arrangement of an experiment
in such a manner that certain effects cannot be distinguished
from other effects. The references previously cited
discuss the methods for intentionally confounding certain
effects, normally the higher order interaction terms, with
other effects for fractional replications in several different
kinds of experimental designs. As discussed in Chapter II,
the linear combination of the parameters q'b is estimable if
and only if q'H = q'. As stated before, a hypothesis is
testable if it consists of estimable functions; so if the
hypothesis q'b = 0 is "tested" with the standard ANOVA
approach, and q'b is not estimable, the hypothesis actually tested
is in the form H0': q'Hb = 0. From the expression q'Hb the
analyst can determine which effects are confounded in the
design. This determination of confounding can be illustrated
through examples.

In  the  first  example,  the  matrix  H  was  determined  to  be 
$$
H = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.
$$

Consider the hypothesis H0: a1 - a2 = 0, which implies
q' = (0 1 -1). For q'b to be estimable, and thus
H0: q'b = 0 testable, the condition q'H = q' must be met:

$$
q'H = (0 \;\; 1 \;\; -1)
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
= (0 \;\; 1 \;\; -1) = q'.
$$

Therefore, the hypothesis H0: a1 - a2 = 0 is testable.
Consider a different hypothesis, say H0: a1 = 0. Then q' = (0 1 0).
Checking for estimability:

$$
q'H = (0 \;\; 1 \;\; 0)
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
= (1 \;\; 1 \;\; 0) \neq q'.
$$

Therefore, the hypothesis H0: a1 = 0 is not testable; the
hypothesis actually tested is of the form H0': q'Hb = 0, or
H0': m + a1 = 0.

This result means that the a1 effect is confounded with the
mean.
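The two checks for the first example amount to a row-vector times matrix product, which the following plain-Python sketch carries out. The helper names are hypothetical and introduced only for illustration; no external library is needed.

```python
H = [[0, 0, 0],
     [1, 1, 0],
     [1, 0, 1]]
names = ["m", "a1", "a2"]

def row_times_matrix(q, M):
    """Return the row vector q'M as a list."""
    return [sum(q[i] * M[i][j] for i in range(len(q))) for j in range(len(M[0]))]

def as_expression(coeffs, names):
    """Write a coefficient vector as a signed sum of named parameters."""
    terms = [f"{c:+g}*{n}" for c, n in zip(coeffs, names) if c != 0]
    return " ".join(terms) if terms else "0"

for label, q in [("a1 - a2", [0, 1, -1]), ("a1", [0, 1, 0])]:
    qH = row_times_matrix(q, H)
    status = "testable" if qH == q else "not testable"
    print(f"H0: {label} = 0 is {status}; q'Hb = {as_expression(qH, names)}")
```

For the second hypothesis the printed expression contains both m and a1, which is the confounding of the a1 effect with the mean.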

In the second example, H was given by equation (11), the
12 x 12 matrix obtained from the design matrix of equation (10).
Checking the testability of the hypothesis H0: a1 - a2 = 0,
where

q' = (0 1 -1 0 0 0 0 0 0 0 0 0),

the second row of H minus the third row gives

q'H = (0 1 -1 0 0 0 1/3 1/3 1/3 -1/3 -1/3 -1/3),

which does not equal q'; therefore, the hypothesis q'b = 0 is
not testable. Similarly, suppose that the hypothesis
H0: c1 - c2 = 0, where q' = (0 0 0 1 -1 0 0 0 0 0 0 0), is tested.
In this case q'H is again not equal to q', and therefore the
hypothesis is not testable. If the hypothesis H0: c1 - c2 = 0
is tested using standard ANOVA test statistics, the actual
hypothesis tested would be:

H0': q'Hb = (0 0 0 1 -1 0 1/2 -1/2 0 1/2 -1/2 0) b
          = c1 - c2 + (1/2)(ac11 - ac12 + ac21 - ac22) = 0.

Confounding obviously is present in the form of the two-way
interaction terms ac_ij. The analyst may suspect that the
interaction terms are not significant, in which case they
can be disregarded. In the present case, if sum_i ac_ij = 0
for j = 1, 2 is assumed as part of the model, H0: c1 - c2 = 0
becomes testable.
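The confounding pattern for H0: c1 - c2 = 0 can be read off mechanically. The sketch below uses only the Python standard library and exact fractions. It assumes that the solution returned for each column of pseudo-data is the one satisfying the usual sum-to-zero restrictions, for which closed forms exist in this balanced layout (grand mean, row and column deviations, and interaction residuals); the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

cells = list(product(range(2), range(3)))                 # (i, j) with one observation each
names = (["m", "a1", "a2", "c1", "c2", "c3"]
         + [f"ac{i + 1}{j + 1}" for i, j in cells])

def design_column(name):
    """Pseudo-data: the column of X belonging to one parameter, keyed by cell."""
    return {(i, j): F(int(name in ("m", f"a{i + 1}", f"c{j + 1}", f"ac{i + 1}{j + 1}")))
            for i, j in cells}

def restricted_solution(y):
    """Solution of the normal equations satisfying the sum-to-zero restrictions."""
    m = sum(y.values()) / 6
    row = [sum(y[(i, j)] for j in range(3)) / 3 for i in range(2)]
    col = [sum(y[(i, j)] for i in range(2)) / 2 for j in range(3)]
    a = [r - m for r in row]
    c = [k - m for k in col]
    ac = [y[(i, j)] - row[i] - col[j] + m for i, j in cells]
    return [m] + a + c + ac

# Column j of H is the solution obtained from the j-th column of pseudo-data.
columns = [restricted_solution(design_column(n)) for n in names]
H = [[columns[j][i] for j in range(12)] for i in range(12)]

q = [0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0]                 # H0: c1 - c2 = 0
qH = [sum(q[i] * H[i][j] for i in range(12)) for j in range(12)]
confounded = {names[j]: qH[j] for j in range(12) if qH[j] != 0}

print("testable:", qH == q)
for name, coeff in confounded.items():
    print(f"{coeff} * {name}")
```

The nonzero entries of q'H name the parameters that enter the hypothesis actually tested, so the interaction terms appearing in `confounded` are exactly the effects confounded with c1 - c2.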

The 1/2 replication of the 2^3 factorial design produces
confounding of many effects, which can be illustrated for the
case of "testing" the hypothesis H0: a1 - a2 = 0. For this
example q' is 1 x 27 and is of the form:

q' = (0 1 -1 0 ... 0).

It is easily verified that q'H ≠ q', implying that q'b is not
estimable, and therefore the hypothesis H0: a1 - a2 = 0 is
not testable. By computing q'Hb, the form of the confounding
can be obtained:


q'Hb  =  +  2  i  ^^^122  ~  ^  ^^^212  ~  ^  ^^^221 

—  "2  ”*  ^  *^^31  ^  *^*^33  ^  ^^11 

+  2  ®-'^13  "  2  ^^21  "  ^  ®’^31  "  ^  ^•‘^33  2  ^^11 

+  |-  aCj^2  ”  2  ^^33  “  *^' 
V. CONCLUSIONS AND RECOMMENDATIONS

The matrix H, once computed, is valuable to the analyst
for determining the estimable linear functions of the parameter
vector and for ascertaining which hypotheses are testable.
As discussed in the previous chapter, information
concerning confounding can be gained through H.

Of particular note is the fact that the computation of
the matrix H is dependent on the design matrix X. Although
an experiment may not have been conducted in accordance with
the original design, the design matrix X will be known. The
methods of computing H discussed in this thesis have limitations.
The direct mathematical approach is limited by the
constraints of the computer routines used to compute X'X, the
generalized inverse matrix G of X'X, and thus H. The second
method, using the analysis of variance routine with its
dimensionality constraints, requires knowledge of the reduction
in dimensionality of the design matrix and the row
expansion of H, but this knowledge would be required to use
the ANOVA routine anyway. The second method is constrained
by the fact that the ANOVA routine will not run for a singular
"reduced" X'X; this constraint led to a modification of
the second method. The limitations of the modified method
also would be in the form of computer program constraints in
computing matrix inverses, solving systems of linear equations
to expand the columns, and expanding the rows using the
linear restrictions (equation (12)) imposed by BMD05V.
A special purpose computer program to develop H could
be written. A desired computer routine would be one that
would accept as input the reduced design matrix and a set
of q's and produce as output the corresponding matrix H
and the q'H's. Although such a computer program might have some
of the limitations discussed previously, the program could
answer the user's questions concerning estimability and
testability. As discussed in the previous chapter, the user
could gain information as to which effects were confounded,
and this in turn would aid in making decisions as to the
significance of this confounding in his experiment.
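A routine of the kind described might look like the following sketch. It is an illustration only: NumPy is assumed, the function name `estimability_report` is hypothetical, and the Moore-Penrose inverse is used for G as a convenient choice. The H it returns therefore need not coincide with the H that a particular ANOVA routine would produce, although the estimability verdicts are the same for any generalized inverse.

```python
import numpy as np

def estimability_report(X, Q, tol=1e-9):
    """Return H, the rows q'H, and a flag per row saying whether q'b is estimable.

    X is the design matrix (n x p); Q holds one q' per row (s x p).
    G is taken as the Moore-Penrose inverse of X'X (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    Q = np.atleast_2d(np.asarray(Q, dtype=float))
    XtX = X.T @ X
    H = np.linalg.pinv(XtX) @ XtX
    QH = Q @ H
    estimable = np.all(np.abs(QH - Q) < tol, axis=1)
    return H, QH, estimable

# Usage with the one-way layout of Chapter III (columns: m, a1, a2).
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]])
Q = [[0, 1, -1],    # a1 - a2
     [0, 1, 0]]     # a1 alone

H, QH, estimable = estimability_report(X, Q)
for q, flag in zip(Q, estimable):
    print(q, "estimable" if flag else "not estimable")
```

For a row flagged as not estimable, the corresponding row of `QH` gives the combination of parameters that the hypothesis would actually involve under this choice of G.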